

Search for: All records

Creators/Authors contains: "Chao, Yuan"

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full-text articles may not be available free of charge during the publisher's embargo period.

Some links on this page may take you to non-federal websites. Their policies may differ from those of this site.

  1. Our world offers a never-ending stream of visual stimuli, yet today's vision systems only accurately recognize patterns within a few seconds. These systems understand the present, but fail to contextualize it in past or future events. In this paper, we study long-form video understanding. We introduce a framework for modeling long-form videos and develop evaluation protocols on large-scale datasets. We show that existing state-of-the-art short-term models are limited for long-form tasks. A novel object-centric transformer-based video recognition architecture performs significantly better on 7 diverse tasks. It also outperforms comparable state-of-the-art methods on the AVA dataset.
  4. Training competitive deep video models is an order of magnitude slower than training their counterpart image models. Slow training causes long research cycles, which hinders progress in video understanding research. Following standard practice for training image models, video model training has used a fixed mini-batch shape: a specific number of clips, frames, and spatial size. However, what is the optimal shape? High-resolution models perform well, but train slowly. Low-resolution models train faster, but are less accurate. Inspired by multigrid methods in numerical optimization, we propose to use variable mini-batch shapes with different spatial-temporal resolutions that are varied according to a schedule. The different shapes arise from resampling the training data on multiple sampling grids. Training is accelerated by scaling up the mini-batch size and learning rate when shrinking the other dimensions. We empirically demonstrate a general and robust grid schedule that yields a significant out-of-the-box training speedup without a loss in accuracy for different models (I3D, non-local, SlowFast), datasets (Kinetics, Something-Something, Charades), and training settings (with and without pre-training, 128 GPUs or 1 GPU). As an illustrative example, the proposed multigrid method trains a ResNet-50 SlowFast network 4.5× faster (wall-clock time, same hardware) while also improving accuracy (+0.8% absolute) on Kinetics-400 compared to baseline training. Code is available online.
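The scheduling idea in this abstract — shrink the temporal and spatial resolution, then scale up the mini-batch size and learning rate so that per-iteration compute stays roughly constant — can be sketched in a few lines. This is a minimal illustration with made-up base values (batch 8, 32 frames, 224 px, lr 0.1), not the authors' implementation; the linear learning-rate scaling rule is an assumption.

```python
# Each "grid" is a (batch, frames, side) shape chosen so that the proxy cost
# batch * frames * side^2 is the same for every grid in the schedule.
BASE = {"batch": 8, "frames": 32, "side": 224, "lr": 0.1}  # hypothetical values

def scaled_grid(frame_div, side_div):
    """Shrink temporal/spatial resolution by the given divisors and scale the
    batch size (and, linearly, the learning rate) to keep compute constant."""
    factor = frame_div * side_div * side_div
    return {
        "batch": BASE["batch"] * factor,
        "frames": BASE["frames"] // frame_div,
        "side": BASE["side"] // side_div,
        "lr": BASE["lr"] * factor,  # linear scaling rule: lr proportional to batch
    }

def cost(g):
    """Proxy for per-iteration compute: total pixels processed per mini-batch."""
    return g["batch"] * g["frames"] * g["side"] ** 2

# A short coarse-to-fine cycle: cheap coarse grids early, the finest grid last.
schedule = [scaled_grid(4, 2), scaled_grid(2, 2), scaled_grid(2, 1), scaled_grid(1, 1)]
```

Because `factor` equals exactly the resolution shrink (`frame_div * side_div**2`), every grid in the schedule has the same proxy cost as baseline training, which is what makes the coarse phases "free" speedup.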
  6. Abstract

    High-degree time-multiplexed multifocal multiphoton microscopy was expected to provide a facile path to scanningless optical sectioning and the fast imaging of dynamic three-dimensional biological systems. However, physical constraints on typical time-multiplexing devices, arising from diffraction in the free-space propagation of light waves, lead to significant manufacturing difficulties and have prevented the experimental realization of high-degree time multiplexing. To resolve this issue, we have developed a novel method using optical fiber bundles of various lengths to confine the diffraction of propagating light waves and to create a time-multiplexing effect. Through this method, we experimentally demonstrate the highest degree of time multiplexing ever achieved in multifocal multiphoton microscopy (~50 times larger than conventional approaches), and hence the potential of using simply manufactured devices for scanningless optical sectioning of biological systems.

     
  7. Abstract

    Human skin progenitor cells will form new hair follicles, although at a low efficiency, when injected into nude mouse skin. To better study and improve upon this regenerative process, we developed an in vitro system to analyse the morphogenetic cell behaviour in detail and modulate physical‐chemical parameters to more effectively generate hair primordia. In this three‐dimensional culture, dissociated human neonatal foreskin keratinocytes self‐assembled into a planar epidermal layer while fetal scalp dermal cells coalesced into stripes, then large clusters, and finally small clusters resembling dermal condensations. At sites of dermal clustering, subjacent epidermal cells protruded to form hair peg‐like structures, molecularly resembling hair pegs within the sequence of follicular development. The hair peg‐like structures emerged in a coordinated, formative wave, moving from periphery to centre, suggesting that the droplet culture constitutes a microcosm with an asymmetric morphogenetic field. In vivo, hair follicle populations also form in a progressive wave, implying the summation of local periodic patterning events with an asymmetric global influence. To further understand this global patterning process, we developed a mathematical simulation using Turing activator‐inhibitor principles in an asymmetric morphogenetic field. Together, our culture system provides a suitable platform to (a) analyse the self‐assembly behaviour of hair progenitor cells into periodically arranged hair primordia and (b) identify parameters that impact the formation of hair primordia in an asymmetric morphogenetic field. This understanding will enhance our future ability to successfully engineer human hair follicle organoids.

     
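The patterning simulation described in this abstract can be caricatured with a standard Gierer-Meinhardt activator-inhibitor system. The sketch below is a generic 1D reaction-diffusion toy, with a hypothetical linear ramp in activator production standing in for the asymmetric morphogenetic field; all parameter values are illustrative and not taken from the paper.

```python
import math

def laplacian(u):
    """Discrete 1D Laplacian with zero-flux (Neumann) boundaries, dx = 1."""
    n = len(u)
    return [u[max(i - 1, 0)] - 2.0 * u[i] + u[min(i + 1, n - 1)] for i in range(n)]

def simulate(n=60, steps=500, dt=0.05, d_a=0.02, d_h=1.0,
             rho=1.0, mu_a=1.0, mu_h=1.2):
    """Explicit-Euler integration of a 1D Gierer-Meinhardt activator (a) /
    inhibitor (h) pair. A linear ramp in activator production (g below)
    models a globally asymmetric field; parameters are illustrative."""
    # start near the homogeneous steady state with a small deterministic ripple
    a = [mu_h + 0.01 * math.cos(6.0 * math.pi * i / n) for i in range(n)]
    h = [mu_h] * n
    for _ in range(steps):
        la, lh = laplacian(a), laplacian(h)
        a_next, h_next = [], []
        for i in range(n):
            g = 0.9 + 0.2 * i / (n - 1)  # hypothetical left-to-right asymmetry
            a_next.append(a[i] + dt * (d_a * la[i]
                          + g * rho * a[i] ** 2 / max(h[i], 1e-9) - mu_a * a[i]))
            h_next.append(h[i] + dt * (d_h * lh[i] + rho * a[i] ** 2 - mu_h * h[i]))
        a, h = a_next, h_next
    return a, h
```

The key Turing ingredients are visible in the equations: local self-activation (the `a**2/h` term), long-range inhibition (`d_h` much larger than `d_a`), and a spatially graded production term supplying the global asymmetry.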
  8. Abstract Many measurements at the LHC require efficient identification of heavy-flavour jets, i.e. jets originating from bottom (b) or charm (c) quarks. An overview of the algorithms used to identify c jets is given, and a novel method to calibrate them is presented. This new method adjusts the entire distributions of the outputs obtained when the algorithms are applied to jets of different flavours. It is based on an iterative approach exploiting three distinct control regions that are enriched with either b jets, c jets, or light-flavour and gluon jets. Results are presented in the form of correction factors evaluated using proton-proton collision data with an integrated luminosity of 41.5 fb⁻¹ at √s = 13 TeV, collected by the CMS experiment in 2017. The closure of the method is tested by applying the measured correction factors to simulated data sets and checking the agreement between the adjusted simulation and collision data. Furthermore, a validation is performed by testing the method on pseudodata, which emulate various mismodelling conditions. The calibrated results enable the use of the full distributions of heavy-flavour identification algorithm outputs, e.g. as inputs to machine-learning models. Thus, they are expected to increase the sensitivity of future physics analyses.
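Stripped to its core, the shape-calibration idea is a per-bin data/simulation ratio applied as an event weight to the tagger discriminant distribution. The sketch below deliberately ignores the iterative three-control-region machinery and uses made-up numbers; the function names are hypothetical and this is not CMS software.

```python
def histogram(values, weights, edges):
    """Weighted histogram of `values` over `edges` (len(edges) - 1 bins)."""
    counts = [0.0] * (len(edges) - 1)
    for v, w in zip(values, weights):
        for i in range(len(edges) - 1):
            if edges[i] <= v < edges[i + 1]:
                counts[i] += w
                break
    return counts

def correction_factors(data_vals, mc_vals, mc_weights, edges):
    """Per-bin data/MC ratio of the discriminant; empty MC bins get factor 1."""
    h_data = histogram(data_vals, [1.0] * len(data_vals), edges)
    h_mc = histogram(mc_vals, mc_weights, edges)
    return [d / m if m > 0 else 1.0 for d, m in zip(h_data, h_mc)]

def reweight(mc_vals, mc_weights, factors, edges):
    """Multiply each simulated jet's weight by the factor of its bin, so the
    full shape of the simulated discriminant matches data."""
    out = []
    for v, w in zip(mc_vals, mc_weights):
        f = 1.0
        for i in range(len(edges) - 1):
            if edges[i] <= v < edges[i + 1]:
                f = factors[i]
                break
        out.append(w * f)
    return out
```

After reweighting, the weighted simulated histogram reproduces the data histogram bin by bin, which is what lets the full discriminant distribution (not just a working-point efficiency) be used downstream, e.g. as a machine-learning input.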